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Abstract 

We present a new, real-time method for rendering diffuse and 
glossy objects in low-frequency lighting environments that cap- 
tures soft shadows, interreflections, and caustics. As a preprocess, 
a novel global transport simulator creates functions over the 
object’s surface representing transfer of arbitrary, low-frequency 
incident lighting into transferred radiance which includes global 
effects like shadows and interreflections from the object onto 
itself. At run-time, these transfer functions are applied to actual 
incident lighting. Dynamic, local lighting is handled by sampling 
it close to the object every frame; the object can also be rigidly 
rotated with respect to the lighting and vice versa. Lighting and 
transfer functions are represented using low-order spherical 
harmonics. This avoids aliasing and evaluates efficiently on 
graphics hardware by reducing the shading integral to a dot 
product of 9 to 25 element vectors for diffuse receivers. Glossy 
objects are handled using matrices rather than vectors. We further 
introduce functions for radiance transfer from a dynamic lighting 
environment through a preprocessed object to neighboring points 
in space. These allow soft shadows and caustics from rigidly 
moving objects to be cast onto arbitrary, dynamic receivers. We 
demonstrate real-time global lighting effects with this approach. 
Keywords: Graphics Hardware, Illumination, Monte Carlo Techniques, 
Rendering, Shadow Algorithms. 

1. Introduction 

Lighting from area sources, soft shadows, and interreflections are 
important effects in realistic image synthesis. Unfortunately, 
general methods for integrating over large-scale lighting environ- 
ments [8], including Monte Carlo ray tracing [7][21][25], rad- 
iosity [6], or multi-pass rendering that sums over multiple point 
light sources [17][27][36], are impractical for real-time rendering. 
Real-time, realistic global illumination encounters three difficul- 
ties - it must model the complex, spatially-varying BRDFs of real 
materials ( BRDF complexity ), it requires integration over the 
hemisphere of lighting directions at each point (light integration ), 
and it must account for bouncing/occlusion effects, like shadows, 
due to intervening matter along light paths from sources to receiv- 
ers (light transport complexity). Much research has focused on 
extending BRDF complexity (e.g., glossy and anisotropic reflec- 
tions), solving the light integration problem by representing 
incident lighting as a sum of directions or points. Light integra- 
tion thus tractably reduces to sampling an analytic or tabulated 
BRDF at a few points, but becomes intractable for large light 
sources. A second line of research samples radiance and pre- 
convolves it with kernels of various sizes [5] [14] [19] [24] [34]. 
This solves the light integration problem but ignores light trans- 
port complexities like shadows since the convolution assumes the 
incident radiance is unoccluded and unscattered. Finally, clever 
techniques exist to simulate more complex light transport, espe- 
cially shadows. Light integration becomes the problem; these 
techniques are impractical for very large light sources. 

Our goal is to better account for light integration and light trans- 
port complexity in real-time. Our compromise is to focus on low- 




Figure 1: Precomputed, unshadowed irradiance from [34] (left) vs. our 
precomputed transfer (right). The right model can be rendered at 1 29Hz 
with self-shadowing and self-interTeflection in any lighting environment 



frequency lighting environments, using a low-order spherical 
harmonic (SH) basis to represent such environments efficiently 
without aliasing. The main idea is to represent how an object 
scatters this light onto itself or its neighboring space. 

To describe our technique, assume initially we have a convex, 
diffuse object lit by an infinitely distant environment map. The 
object’s shaded “response” to its environment can be viewed as a 
transfer function , mapping incoming to outgoing radiance, which 
in this case simply performs a cosine-weighted integral. A more 
complex integral captures how a concave object shadows itself, 
where the integrand is multiplied by an additional transport factor 
representing visibility along each direction. 

Our approach is to precompute for a given object the expensive 
transport simulation required by complex transfer functions like 
shadowing. The resulting transfer functions are represented as a 
dense set of vectors or matrices over its surface. Meanwhile, 
incident radiance need not be precomputed. The graphics hard- 
ware can dynamically sample incident radiance at a number of 
points. Analytic models, such as skylight models [33] or simple 
geometries like circles, can also be used. 

By representing both incident radiance and transfer functions in a 
linear basis (in our case, SH), we exploit the linearity of light 
transport to reduce the light integral to a simple dot product 
between their coefficient vectors (diffuse receivers) or a simple 
linear transform of the lighting coefficient vector through a small 
transfer matrix (glossy receivers). Low-frequency lighting envi- 
ronments require few coefficients (9-25), enabling graphics 
hardware to compute the result in a single pass (Figure 1, right). 
Unlike Monte-Carlo and multi-pass light integration methods, our 
run-time computation stays constant no matter how many or how 
big the light sources, and in fact relies on large-scale, smooth 
lighting to limit the number of SH coefficients necessary. 

We represent complex transport effects like interreflections and 
caustics in the transfer function. Since these are simulated as a 
preprocess, only the transfer function’s basis coefficients are 
affected, not the run-time computation. Our approach handles 
both surface and volume-based geometry. With more SH coeffi- 
cients, we can even handle glossy (but not highly specular) 
receivers as well as diffuse, including interreflection. 25 coeffi- 
cients suffice for useful glossy effects. In addition to transfer 
from a rigid object to itself, called self -transfer, we generalize the 
technique to neighborhood-transfer from a rigid object to its 
neighboring space, allowing cast soft shadows, glossy reflections, 
and caustics on dynamic receivers, see Figure 7. 




diffuse surface self-transfer 




glossy surface self-transfer 



Figure 2: Self-Transfer Run-Time Overview. Red signifies positive SH coefficients 
and blue, negative. For a diffuse surface (top row), the SH lighting coefficients (on the 
left) modulate a field of transfer vectors over the surface (middle) to produce the final 
result (right). A transfer vector at a particular point on the surface represents how the 
surface responds to incident light at that point, including global transport effects like 
self-shadowing and self-interreflection.. For a glossy surface (bottom row), there is a 
matrix at each point on the model instead of a vector. This matrix transforms the light- 
ing coefficients into the coefficients of a spherical function representing transferred 
radiance. The result is convolved with the model’s BRDF kernel and evaluated at the 
view-dependent reflection direction R to yield the result at one point on the model. 



Overview As a preprocess, a global illumination 
simulator is run over the model that captures how it 
shadows and scatters light onto itself. The result is 
recorded as a dense set of vectors (diffuse case) or 
matrices (glossy case) over the model. At run-time 
(Figure 2), incident radiance is first projected to the SH 
basis. The model’s field of transfer vectors or matrices 
is then applied to the lighting’s coefficient vector. If the 
object is diffuse, a transfer vector at each point on the 
object is dotted with the lighting’s coefficients to 
produce correctly self-scattered shading. If the object is 
glossy, a transfer matrix is applied to the lighting 
coefficients to produce the coefficients of a spherical 
function representing self-scattered incident radiance at 
each point. This function is convolved with the object’s 
BRDF and then evaluated at the view-dependent reflec- 
tion direction to produce the final shading. 

2. Related Work 

Scene relighting precomputes a separate global illumi- 
nation solution per light source as we do; linear 
combinations of the results then provide limited dy- 
namic effects. Early work [2][11] adjusts intensities of 
a fixed set of sources and is not intended to fit general 
lighting environments. Nimeroff, et al. [33] precompute 
a “steerable function” basis for general skylight illumi- 
nation on a fixed view. Their basis, essentially the 
spherical monomials, is related to the SH by a linear 
transformation and thus shares some of its properties 
(e.g., rotational invariance) but not others (e.g., or- 
thonormality). Teo, et al. [40] generalize to non-infinite 
sources, using principal component analysis to reduce 
the basis set. Our work differs by computing a transfer 
field over the object’s surface in 3D rather than over a 
fixed 2D view to allow viewpoint changes. Dobashi, et 
al. [10] use the SH basis and transfer vector fields over 
surfaces to allow viewpoint change but restrict lighting 
changes to the directional intensity distribution of an 
existing set of non-area light sources in diffuse scenes. Debevec, 
et al. [9] relight faces using a directional light basis. Real-time 
rendering requires a fixed view. 

Shadow maps, containing depths from the light source’s point of 
view, were first used by Williams [43] to simulate point light 
source shadows. Many extensions of the basic technique, some 
suitable for real-time rendering, have since been described: 
percentage-closer filtering [35], which softens shadow edges, 
layered depth maps [26] and layered attenuation maps [1], which 
more accurately simulate penumbra shape and falloff, and deep 
shadow maps [29], which generalize the technique to partially 
transparent and volume geometry. All these techniques assume 
point or at least localized light sources; shadowing from larger 
light sources has been handled by multi-pass rendering that sums 
over a light source decomposition into points or small sources 
[17] [27] [3 6]. Large light sources become very expensive. 
Another technique [39] uses FFT convolution of occluder projec- 
tions for soft shadowing with cost independent of light source 
size. Only shadows between pre-segmented clusters of objects 
are handled, making self-shadows on complex meshes difficult. 
Finally, accessibility shading [32] is also based on precomputed 
global visibility, but is a scalar quantity that ignores changes in 
lighting direction. 

Methods for nonlocal lighting on micro-geometry include the 
horizon map [4] [31], which efficiently renders self-shadowing 
from point lights. In [20], this technique is tailored to graphics 
hardware and generalized to diffuse interreflections, though 



interreflection change due to dynamic lighting is still not real- 
time. By precomputing a higher-dimensional texture, polynomial 
texture maps [30] allow real-time interreflection effects as well as 
shadowing. A similar approach using a steerable basis for direc- 
tional lighting is used in [3]. Like our approach, these methods 
precompute a simple representation of a transfer function, but one 
based on directional light sources and thus requiring costly multi- 
pass integration to simulate area lights. We compute self-transfer 
directly on each preprocessed 3D object rather than mapping it 
with 2D micro-geometry textures, allowing more global effects. 
Finally, our neighborhood transfer extends these ideas to cast 
shadows, caustics, and reflections. 

Caching onto diffuse receivers is useful for accelerating global 
illumination. Ward et. al. [41] perform caching to simulate 
diffuse interreflection in a ray tracer. Photon maps [21] also 
cache but perform forward ray tracing from light sources rather 
than backwards from the eye, and handle specular bounces in the 
transport (as does our approach). We apply this caching idea to 
real-time rendering, but cache a transfer function parameterized 
by a SH lighting basis rather than scalar irradiance. 

Precomputed transfer using light-field remapping [18] and 
dynamic ray tracing [16] has been used to achieve highly specular 
reflections and refractions. We apply a similar precomputed, per- 
object decomposition but designed instead for soft shadows and 
caustics on nearly diffuse objects in low-frequency lighting. 
Ircadiance volumes [15] allow movement of diffuse receivers in 
precomputed lighting. Unlike our approach, lighting is static and 
the receiver’s effect on itself and its environment is ignored. 




Spherical harmonics have been used to represent incident radi- 
ance and BRDFs for offline rendering and BRDF inference [4] 
[3 8] [42]. Westin, et al. [42] use a matrix representation for 4D 
BRDF functions in terms of the SH basis identical to our transfer 
matrix. But rather than the BRDF, we use it to represent global 
and spatially varying transport effects like shadows. The SH basis 
has iso been used to solve ambiguity problems in computer 
vision [12] and to represent irradiance for rendering [34]. 



3. Review of Spherical Harmonics 



Definition Spherical harmonics define an orthonormal basis over 
the sphere, 5, analogous to the Fourier transform over the ID 
circle. Using the parameterization 

s = (x,y,z) = (sin0 cosp, sin0 sin cp, cos0) , 



the basis functions are defined as 

Y t m (O t q>) = K? e^/> w (cos0), / e TV, -/ <m< l 
where P /” are the associated Legendre polynomials and Kp are 
the normalization constants 
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The above definition forms a complex basis; a real-valued basis is 
given by the simple transformation 
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Low values of / (called the band index) represent low-frequency 
basis functions over the sphere. The basis functions for band / 
reduce to polynomials of order / in x, y, and z. Evaluation can be 
done with simple recurrence formulas [13][44]. 

Projection and Reconstruction Because the SH basis is or- 
thonormal, a scalar function / defined over S can be projected 
into its coefficients via the integral 

fr=\m y T{s)ds (i) 

These coefficients provide the n-th order reconstruction function 

/(*)-£ ZJTrfw {2) 

i =o «=-/ 

which approximates / increasingly well as the number of bands n 
increases. Low-frequency signals can be accurately represented 
with only a few SH bands. Higher frequency signals are bandlim- 
ited (i.e., smoothed without aliasing) with a low-order projection. 
Projection to n-th order involves n 2 coefficients. It is often con- 
venient to rewrite (2) in terms of a singly-indexed vector of 
projection coefficients and basis functions, via 

where *-/(/+ l)+/w+l. This formulation makes it obvious that 
evaluation at s of the reconstruction function represents a simple 
dot product of the /^-component coefficient vector f with the 
vector of evaluated basis functions 

Basic Properties A critical property of SH projection is its 
rotational invariance; that is, given g(s) = /({?($)) where Q is an 
arbitrary rotation over S then 

s«»/(GW) ( 4 ) 

This is analogous to the shift-invariance property of the ID 
Fourier transform. Practically, this property means that SH 
projection causes no aliasing artifacts when samples from / are 
collected at a rotated set of sample points. 

Orthonormality of the SH basis provides the useful property that 
given any two functions a and b over 5, their projections satisfy 



\a(s)b{s)ds = j^a l b, . (5) 

1-1 

In other words, integration of the product of bandlimited functions 
reduces to a dot product of their projection coefficients. 
Convolution We denote convolution of a circularly symmetric 
kernel function h(z) with a function /as h * f . Note that h must 
be circularly symmetric (and hence can be defined as a simple 
function of z rather than s ) in order for the result to be defined on 
S rather than the higher-dimensional rotation group 50(3). 
Projection of the convolution satisfies 

(h * f)", = ti f = <*? if rr . (6) 

In other words, the coefficients of the projected convolution are 
simply scaled products of the separately projected functions. 
Note that because h is circularly symmetric about z, its projection 
coefficients are nonzero only for m= 0. The convolution property 
provides a fast way to convolve an environment map with a 
hemispherical cosine kernel, defined as h(z) = max(z,0) , to get 
an irradiance map [34], for which the are given by an analytic 
formula. The convolution property can also be used to produce 
prefiltered environment maps with narrower kernels. 

Product Projection Projection of the product of a pair of spheri- 
cal functions c(s) = a{s)b(s) where a is known and b unknown 
can be viewed as a linear transformation of the projection coeffi- 
cients bj via a matrix a : 

= (j^ a{s)y,(s)yj(s)ds^bj 

where summation is implied over the duplicated j and k indices. 
Note that a is a symmetric matrix. The components of a can be 
computed by integrating the triple product of basis functions using 
recurrences derived from the well-known Clebsch-Gordan series 
[13][44]. It can also be computed using numerical integration 
without SH-projecting the function a beforehand. Note that the 
product's order n projection involves coefficients of the two factor 
functions up to order 2nA . 

Rotation A reconstruction function rotated by Q , /(C?^))* can 
be projected into SH using a linear transformation of/s projec- 
tion coefficients, /. Because of the rotation invariance property, 
this linear transformation treats the coefficients in each band 
independently. The most efficient implementation is achieved via 
a zyz Euler angle decomposition of the rotation Q , using a fairly 
complicated recurrence formula [13][44], Because we deal only 
with low-order functions, we have implemented their explicit 
rotation formulas using symbolic integration. 

4. Radiance Self-Transfer 

Radiance self-transfer encapsulates how an object 0 shadows and 
scatters light onto itself. To represent it, we first parameterize 
incident lighting at points peO , denoted L p (s ), using the SH 
basis. Incident lighting is therefore represented as a vector of n 2 
coefficients (L p )\. We sample the lighting dynamically and 
sparsely near the surface, perhaps at only a single point. The 
assumption is that lighting variation over 0 not due to its own 
presence is small (see Section 6.1). We also precompute and 
store densely over 0 transfer vectors or matrices. 

A transfer vector (M p ) t is useful for diffuse surfaces and repre- 
sents a linear transformation on the lighting vector producing 
scalar exit radiance, denoted L ' p , via the inner product 

(i,), . (8) 

#= i 





(a) unshadowed (b) shadowed (c) interre fleeted 

Figure 3: Diffuse Surface Self-transfer. 



In other words, each component of (M p )j represents the linear 
influence that a lighting basis function ( L p )\ has on shading at p. 

A transfer matrix (AT),, is useful for glossy surfaces and repre- 
sents a linear transformation on the lighting vector which 
produces projection coefficients for an entire spherical function of 
transferred radiance L p ( 5 ) rather than a scalar; i.e., 

(£;),= • ( 9 > 
J 9 1 

The difference between incident and transferred radiance is that 
LAs) includes shadowing/scattering effects due to the presence 
01 O while L p (s) represents incident lighting assuming 0 was 
removed from the scene. Components of (M )^ represent the 
linear influence of the y-th lighting coefficient 01 incident radiance 
(L p )j to the z-th coefficient of transferred radiance (L p ) f . The next 
sections derive transfer vectors for diffuse surfaces and transfer 
matrixes for glossy surfaces due to self-scattering on O. 

4.1 Diffuse Transfer [transfer vector for known normal] 

First assume O is diffrise. The simplest transfer function at a 
point peO represents unshadowed diffuse transfer , defined as the 
scalar function 

Tov(.L f ) = {p f /x)lL p (s)H Np (s)ds 

producing exit radiance which is invariant with view angle for 
diffuse surfaces. Here, p p is the object’s albedo at p , L p is the 
incident radiance at p assuming O was removed from the scene, 
N p is the object’s normal at p , and H N (s) = max(N p *s,0) is the 
cosine-weighted, hemispherical kernel about N p . By SH- 

projecting L p and H Np separately, equation (5) reduces T DV to an 
inner product of their coefficient vectors. We call the resulting 
factors the light function , L p and transfer function, M p . In this 
first simple case, M p u (5) = H Np (s ) . 

Because N p is known, the SH-projection of the transfer function 
can be precomputed, resulting in a transfer vector. In 
fact, storing is unnecessary because a simple analytic formula 
yields it given N p . Because M p u is inherently a low-pass filter, 
second-order projection (9 coefficients) provides good accuracy in 
an arbitrary (even non-smooth) lighting environment [34]. 

To include shadows, we define shadowed diffuse transfer as 
Tos(L p ) = (p,/w) j L p (s)H Np (s)Y p (s)<is 

where the additional visibility function, V p (s) — » {0,1} , equals 1 
when a ray from p in the direction s fails to intersect O again (i.e., 
is unshadowed). As with unshadowed transfer, we decompose 
this integral into two functions, using an SH-projection of L p and 
the transfer function 

M™(s) = H Sp (s)V p (s) . (10) 

Separately SH-projecting L p and M p again 
reduces the integral in T DS to an inner 
product of coefficient vectors. 

Transfer is now nontrivial; we precompute 
it using a transport simulator (Section 5), 
storing the resulting transfer vector (M p ), at 
many points p over O. Unlike the previous 
case, second-order projection of may 



be inaccurate even for smooth lighting environments since V p can 
create higher- frequency lighting locally, e.g., by self-shadowing 
“pinholes”. 4-th or 5-th order projection provides good results on 
typical meshes in smooth lighting environments. 

Finally, to capture diffuse interreflections as well as shadows, we 
define interreflected diffuse transfer as 

T a (l p ) = T DS (L p ) + { Pp ln)\l p {s) H„ p (s) 

where Z^jJis the radiance from O itself in the direction s. The 
difficulty is that unless the incident radiance emanates from an 
infinitely-distant source, we don’t actually know L (s) given the 
incident radiance only at p because L p depends on the exit 
radiance of points arbitrarily far from p and local lighting varies 
over O. If lighting variation is small over O then L is well- 
approximated as if O were everywhere illuminated by L p . T D1 
thus depends linearly on L p and can be factored as in the previous 
two cases into a product of two projected functions: one light- 
dependent and the other geometry-dependent. 

Though precomputed interreflections must make the assumption 
of spatially invariant incident lighting over O , simpler shadowed 
transfer need not. The difference is that shadowed transfer de- 
pends only on incident lighting at /?, while interreflected transfer 
depends on many points q* p over O at which L q * L p . Thus, 
as long as the incident radiance field is sampled finely enough 
(Section 6.1), local lighting variation can be captured and shad- 
owed transfer will be correct. 

The presence of L makes it hard to explicitly denote the transfer 
function for interreflections, M p J ( 5 ) . We will see how to com- 
pute its projection coefficients numerically in Section 5. 

4.2 Glossy Transfer [transfer matrix for unknown direction] 
Self-transfer for glossy objects can be defined similarly, but 
generalizes the kernel function to depend on a (view-dependent) 
reflection direction R rather than a (fixed) normal N. Analogous 
to the H kernel from before, we model glossy reflection as the 
kernel G(s,R,r) where a scalar r defines the “glossiness” or broad- 
ness of the specular response. We believe it is possible to handle 
arbitrary BRDFs as well using their SH projection coefficients 
[38] but this remains for future work. 

We can then define the analogous three glossy transfer functions 
for the unshadowed, shadowed, and interreflected cases as 

T GU (L p ,R,r) = \L p (s)G(s,R,r)ds 

T cs (L p ,R,r) = lL p (s)G(s,R,r)V p {s)ds 

T al (L p ,R,r ) = 7^(1,) + G(s,R,r) (l - 

which output scalar radiance in direction R as a function of L p and 
Ry quantities both unknown at precomputation time. Since trans- 
fer is no longer solely a function of s , it can’t be reduced to a 
simple vector of SH coefficients 

Instead of parameterizing scalar transfer by R and r, a more useful 
decomposition is to transfer the incident radiance L p (s) into a 
whole sphere of transferred radiance, denoted V (j) . Assuming 
the glossy kernel G is circularly symmetric about R (i.e., a simple 
Phong-like model) L p (s) can then be convolved with 
G*(z) = G(s, (0,0,1), r) and evaluated at R to produce the final 






result (see bottom of Figure 2, and further details in Section 6). 
Transfer to L p can now be represented as a matrix rather than a 
vector. For example, glossy shadowed transfer is 

M?(L p ,s) = L p (s)V p (s) (11) 

a linear operator on L p whose SH-projection can be represented as 
the symmetric matrix F^via equation (7). Even with smooth 
lighting, more SH bands must be used for L p as O’s glossiness 
increases; non-square matrices (e.g., 25x9) mapping low- 
frequency lighting to higher-frequency transferred radiance are 
useful under these conditions. For shadowed glossy transfer (but 
not interreflected), an alternative still uses a vector rather than a 
matrix to represent M™ by computing the product of V p with L p 
on-the-fly using the tabulated triple product of basis functions in 
equation (7). We have not yet implemented this alternative. 

4.3 Limitations and Discussion 

An important limitation of precomputed transfer is that material 
properties of 0 influencing interreflections in Tpj and T GJ (like 
albedo or glossiness) must be “baked in” to the preprocessed 
transfer and can’t be changed at run-time. On the other hand, the 
simpler shadowed transfer without interreflection does allow run- 
time change and/or spatial variation over 0 of the material proper- 
ties. Error arises if blockers or light sources intrude into O’s 
convex hull. O can only move rigidly, not deform or move one 
component relative to the whole. Recall also the assumption of 
low lighting variation over O required for correct interreflections. 
Finally, note that diffuse transfer as defined produces radiance 
after leaving the surface , since it has already been convolved with 
the cosine-weighted normal hemisphere, while glossy transfer 
produces radiance incident on the surface and must be convolved 
with the local BRDF to produce the final exit radiance. It’s also 
possible to bake in a fixed BRDF for glossy 0, making the convo- 
lution with G unnecessary at run-time but limiting flexibility. 

5. Precomputing Radiance Self-Transfer 

As a preprocess, we perform a global illumination simulation over 
an object O using the SH basis over the infinite sphere as emitters. 
Our light gathering solution technique is a straightforward adapta- 
tion of existing approaches [7] [25] and could be accelerated in 
many ways; its novelty lies in how it parameterizes the lighting 
and collects the resulting integrated transfers. Note that all 
integrated transfer coefficients are signed quantities. 

The simulation is parameterized by an n-th order SH projection of 
the unknown sphere of incident light L; i.e., by n 2 unknown 
coefficients L h Though the simulation results can be computed 
independently for each L f using the SH basis function y/(s) as an 
emitter, it is more efficient to compute them all at once. The 
infinitely-distant sphere L will be replaced at run-time by the 
actual incident radiance field around 0, L p . 

An initial pass simulates direct shadows from paths leaving L and 
reaching sample points p e 0 . In subsequent passes, interreflec- 
tions are added, representing paths from L that bounce a number 
of times off 0 before arriving at p (Lp, LDp , LDDp , etc.). In each 
pass, energy is gathered to every sample point p. Large emitters 
(i.e., low-frequency SH basis) make a gather more efficient then a 
shooting-style update [6]. Note that this idea of caching onto 
diffuse (or nearly diffuse) receivers is not new [21][41]. 

To capture the sphere of directions at sample points peO , we 
generate a large (10k-30k), quasi-random set of directions {^}, 
s d eS . We also precompute evaluations for all the SH basis 
functions at each s d . The s d are organized in hierarchical bins 
formed by refining an initial icosahedron with 1— >2 bisection into 
equal-area spherical triangles (1— >4 subdivision does not lead to 
equal area triangles on the sphere as it does in the plane). We use 



6 to 8 subdivision levels, creating 512 to 2048 bins. Every bin at 
each level of the hierarchy contains a list of the s d within it. 

In the first pass, for each p e 0 , we cast shadow rays in the 
hemisphere about p’s normal Np, using the hierarchy to cull 
directions outside the hemisphere. We tag each direction s d with 
an occlusion bit, 1 - V p (s d ) , indicating whether s d is in the hemi- 
sphere and intersects 0 again (i.e., is self-shadowed by 0). An 
occlusion bit is also associated with the hierarchical bins, indicat- 
ing whether any s 6 within it is occluded. Self-occluded directions 
and bins are tagged so that we can perform further interreflection 
passes on them; completely unoccluded bins/samples receive only 
direct light from the environment. 

For diffuse surfaces, at each point p e 0 we further compute the 
transfer vector by SH-projecting M p from (10). For glossy sur- 
faces, we compute the transfer matrix by SH-projecting M p from 
(11). In either case, the result represents the radiance collected at 
p, parameterized by L. SH-projection to compute the transfers is 
performed by numerical integration over the direction samples s A 
summing into an accumulated transfer using the following rules: 



diffuse: 


+ = (/>,/») V,M H„(s d ) y,(s d ) 


glossy: 


(M p )l += V p (s d ) yj (s d )y,(s d ) 


Transfer integration over s^ fshadow pass, iteration 01 



The superscript 0 refers to the iteration number. The vector M p or 
matrix M at each point p is initialized to 0 before the shadow 
pass, whicn then sums over all s d at every p. The rules are derived 
using equation (1) for diffuse transfer integration, and equation 
(7) for glossy transfer integration. 

Later interreflection passes traverse the 
bins having the occlusion bit set during 
the shadow pass. Instead of shadow 
rays, they shoot rays that return trans- 
fer from exiting illumination on 0. If 
the ray (p,s d ) intersects another point 
q e 0 (where q is closest to p), we sample the radiance exiting 
from q in the direction -s d . The following update rules are used, 
where the superscript b is the bounce pass iteration: 



diffuse: 


(M„i + = (/>,/*) (l - W) (M d t' HAS,) 


(X,)*+= {\~V p (s d )) 

glossy, (G* ) t (M,)l''y t (reflect AT,)) 




Transfer integration over s* rinterreflection passes, iteration b\ 



As in the shadow pass, we begin by initializing transfer vectors or 
matrices to 0 before accumulating transfer over directions s d . The 
diffuse rules are derived from the definition of T DJ and equation 
(1); glossy rules from the definition of T CI and equations (6) and 
(7). The middle factor in the glossy transfer definition represents 
radiance emanating from q back to p from the previous bounce 
pass, b- 1. Since M stores incident radiance, it must be con- 
volved with 0’s BRDF at q to obtain exiting radiance in the 
~s d direction, yielding a summation over L Recall that a k is the 
k-Xh convolution coefficient, expressed in singly-indexed notation. 
The “reflect” operator simply reflects its first vector argument 
with respect to its second. We observe that equation (7) implies 
(M p )ij is a symmetric matrix for shadowed glossy transfer since it 
is formed by the product of two spherical functions; this is untrue 
for interreflected glossy transfer. 

Interreflection passes are repeated until the total energy of a given 
pass falls below a threshold. For typical materials, it diminishes 
quite rapidly. The sum of transfers from all bounce passes then 







accounts for interreflections. Our implementation simulates 
diffuse and glossy transfer at the same time. 

A simple enhancement to this simulation allows mirror-like 
surfaces within 0. We do not record transfers on such surfaces. 
Instead, a ray striking a mirrored surface is always reflected and 
then propagated until a non-mirrored surface is reached. Thus our 
paths at successive iterations can be represented as (L[SJ p> 
L[S]'D[S]'p , L[S]'D[S]'D[S]'p, etc.), where D is a diffuse or 
glossy bounce and S is a specular one. This captures caustics 
onto diffuse or glossy receivers that respond dynamically to 
lighting change (Figure 9). 



6. Run-time Rendering of Radiance Transfer 

We now have a model 0 capturing radiance transfer at many 

points p over its surface, represented as vectors or matrices. 

Rendering 0 requires the following steps at run-time: 

1 . compute incident lighting {L^} at one or more sample points 
Pi near 0 in terms of the SH basis, 

2. rotate these L Pi to 0’s coordinate frame and blend them (see 
below) to produce a field of incident lighting L p over 0, and 

3. perform a linear transformation on {L p )j at each point p on 0 
to obtain exit radiance. This requires a dot product with ( M p )i 
for diffuse surfaces (equation (8)), or a matrix-vector 
multiplication with (M p ) 0 for glossy surfaces (equation (9)). 

4. Glossy surfaces need a final step in which the radiance vector 
resulting from step 3 is convolved with 0’s BRDF at /?, and 
then evaluated at the view-dependent reflection direction R. 



Step 1 can load a precomputed environment map, evaluate ana- 
lytic lighting models in software, or sample radiance using 
graphics hardware. Rotation for Step 2 is outlined in Section 3, 
and is done once per object, not for each p . It is necessary be- 
cause transfer is stored using a common coordinate system for 0. 
If 0 is rigidly moving, it is more efficient to rotate the few radi- 
ance samples in L Pi to align with 0 than it is to rotate 0’s many 
transfer functions. We currently perform this rotation in software. 
For diffuse surfaces, a simple implementation of step 3 is to store 
the transfer vector per vertex and perform the dot product in a 
vertex shader. The transfer vectors can also be stored in texture 
maps rather than per-vertex and evaluated using a pixel shader. 
Since the coefficients are signed quantities not always in the [-1,1] 
range, DirectX 8.1 pixel shaders (VI. 4) or their OpenGL counter- 
part (extension by ATI) must be used, since they provide a larger 
range of [-8,8]. Our pixel shader needs 8 instructions to perform 
the dot-product and stores L P s coefficients in constant registers. 



For colored environments or simulation of color bleeding on 0, 
three passes are required, each performing a separate dot-product 
for the r, g, and b channels. Otherwise a single pass suffices. 

For glossy self-transfer, we perform the matrix transform from 
equation (9) in software because the transfer matrix is too big to 
be manipulated in either current vertex or pixel shaders. The 
result is (£'), the SH coefficients of transferred radiance at points 
p over 0. Then in a pixel shader, we perform a convolution with 
a simple cosine-power (Phong lobe) kernel for G and evaluate 
the result in the reflection direction R. The result can be written 



x«.c;(xw,), 

i-i w* 1 / 



y t (R) 



( 12 ) 



We evaluate SH-projections up to n = 5 on graphics hardware. 



6.1 Spatial Sampling of the Incident Radiance Field 
A simple and useful approach for dynamically sampling incident 
radiance is to sample it at 0’s center point. To handle local 
lighting variation over 0, a more accurate technique samples 
incident lighting at multiple points (Figure 5). A good set of 
sample points can be obtained using the ICP (iterated closest 




(a) single sample (b) ICP points (c) multiple samples 



Figure 5: ICP can be used to precompute good locations for sampling 
the incident radiance field over an object. Note the improved locality of 
lighting in (c) compared to (a) when the lighting is sampled at the 8 
points in (b) rather than at the object center. 

point) algorithm [28] as a preprocess, given a desired number of 
points as input. This produces a representative set of points P, 
near 0 and distributed uniformly over it where incident lighting 
can be sampled at run-time. We can also precompute coefficients 
at each p over 0 that blend contribution from each of the resulting 
sampled radiance spheres L Pi to produce an incident radiance field 
over 0, denoted previously by L p . 

6.2 Sampling SH Radiance on Graphics Hardware 
Graphics hardware is useful to capture the radiance samples {L Pi } 
in a dynamic scene. To do this, 6 images are rendered from each 
Pi corresponding to the 6 faces of the cube map spherical parame- 
terization. 0 itself should be removed from these renderings. 
Cube map images can then be projected to their SH coefficients 
using the integral in equation (1), as was done in [4]. 

For efficiency, we precompute textures for the basis functions 
weighted by differential solid angle, /?*($) = y?(s)ds(s ) , each 
evaluated over the cube map parameterization for s. The result- 
ing integral then becomes a simple dot product of the captured 
samples of Lj^s) with the textures B™ ( 5 ) . 

Ideally, this computation would be performed on the graphics 
hardware. Precision issues and inability to do inner products in 
hardware force us to read back the sampled radiance images and 
project them in software. In this case, it is important to reduce 
the resolution of read-back images as much as possible. 

Low-order SH projection can be computed with very low- 
resolution cube maps, assuming they have been properly bandlim- 
ited. For example, spherical signals already bandlimited to 6-th 
order can be projected using six 4x4 images with about 0.3% 
average-case squared error and about 1% worst-case squared 
error, where error is normalized by assuming unit-power signals 
(i.e., signals whose integrated square over the sphere is l). 1 For 
6x8x8 maps, this error reduces to 0.003% mean and 0.02% worst- 
case. Unfortunately, typical signals aren’t spherically bandlimited. 
Another analysis shows that, assuming continuous bilinear recon- 
struction over the sampled 2D images, projection to 6-th order 
using 6x8x8 images yields 0.7% and 2% average and worst-case 
squared error, while 6x16x16 yields 0.2% and 0.5% squared 
error, and 6x32x32 yields 0.05% and 0.1% squared error. 

We extract 6x16x16 images from the hardware. As is always true 
in point-sampled rendering, aliasing of the 2D images is still a 
problem because the above analysis uses bilinear reconstruction 
from point samples as the reference. To reduce aliasing, we 
supersample the cube map images by a factor of 2 in each dimen- 
sion, and do a box-filtered decimation in hardware before reading 
back and projecting. The basis function textures are also super- 
sampled and decimated in the same way as a preprocess. A 
radiance sample, including read-back and SH projection, takes 
about 1.16ms on a PIII-933 PC with an ATI Radeon 8500. 



1 More precisely, aveTage-case error is the integrated squared difference between the 
reference and reconstruction signals, averaged over all unit-power signals. Worst- 
case error is the same integrated error, but for the worst-case unit-power signal. 





Figure 6: Volumetric self-transfer captures how this cloud model 
shadows itself. Lighting can be changed in real-time (first row). The 
same model can also be placed in other environments (second row). 



7. Self-Transfer for Volumetric Models 

Self- transfer on volumetric data uses the same framework as 
surfaces. The resulting precomputed model allows run-time 
changes to the lighting, with correct shadowing and interreflec- 
tions in any low-frequency lighting environment (Figure 6). 

Our simple simulator currently works only for diffuse volumes. 
As with surface transfer, a preprocessing step simulates lighting 
on the volume using the SH basis functions as emitters. For 
shadowed transfer without interreflection (i.e., direct shadowing), 
we gather energy from the emitter to every voxel p of the volume, 
attenuated by its path through the volume. The required numeri- 
cal integration over directions s d can be expressed as 
(M p )1 + = A(p — » p+Dsj) y,(s t ) 
where A(p -» q) is the volume’s integrated attenuation along the 
path from p to q , and D is the distance until the ray (p,s d ) exits the 
volume. To include interreflections, we traverse every voxel p 
and forward-scatter its transfer along random directions s d . The 
transfer is deposited to all voxels q along s d until exiting the 
volume, using the rule 

More passes over the volume produce further indirect bounces. 
Rendering is performed in the traditional way: by drawing slices 
through the 3D volume in back to front order using alpha blending 
to account for transparency. Each slice is a 2D image containing 
samples of the transfer vector. A pixel shader computes the dot- 
product between the lighting’s coefficients and the transfer vec- 
tor’s required to shade each slice. 

8. Radiance Neighborhood-T ransfer 

Neighborhood-transfer precomputes an object O’s influence on 
its neighboring environment with respect to parameterized, low- 
frequency lighting. Transport simulation is identical to that for 
self-transfer in Section 5, but takes place with respect to points in 
a 3D space surrounding 0, not on it. At run-time, an arbitrary 
receiver R can be placed in this volume to capture shadows, 
reflections, and caustics cast by 0 onto R without knowing R in 
advance. For example, a moving vehicle 0 can cast shadows 
over a terrain R (Figure 7). Cast shadows and lighting also 
respond to lighting change; for example, moving the lights move 
soft shadows on R. This generalizes irradiance volumes [15] by 
accounting for glossy transfer and allowing dynamic lighting. 
Because R is unknown during the precomputation step, 0’s 
neighborhood volume must store a transfer matrix rather than a 
vector. This is true even for diffuse receivers, because we do not 
know in advance what R's normal will be. Our current imple- 
mentation precomputes the transfer matrix M p at each point 
within a simple 3D grid surrounding 0. At run-time, we perform 
the matrix transform from equation (9) in software at each point 




Figure 7: Neighborhood transfer captures how this hang glider blocks 
light to a volume of points below it. This allows cast soft shadow onto a 
bumpy terrain as the glider moves. 



in the volume and upload the result to the graphics hardware. The 
result is a volume texture containing coefficients of transferred 
radiance (L' ), which is applied to R. 

Then in a pixel shader this transferred radiance is used to light the 
receiver. A diffuse receiver convolves the radiance with the 
cosine weighted hemisphere H* using equation (6) and then 
evaluates the resulting SH projection at /?’ s normal vector. 
Glossy receivers perform equation (12). 

Receivers having precomputed self-transfer raise the difficulty 
that 0 and R do not share a common coordinate system. Thus, 
one of the two object’s dense set of transfer samples must be 
dynamically rotated to align with the other’s. The SH-rotation 
operation required is currently impractical for hardware evalua- 
tion. Improving hardware should soon ease this difficulty. 
Compared to self-transfer, neighborhood-transfer incurs some 
additional approximation errors. Cast shadow or light from 
multiple neighborhood -transfer objects onto the same receiver is 
hard to combine. Local lighting variation not due to 0 or IT’s 
presence is also a problem; lighting must be fairly constant across 
0’s entire neighborhood to provide accurate results. In particular, 
errors such as missing shadows will result when objects besides 0 
and R intrude into 0’s neighborhood. 0’s neighborhood must 
also be large enough to encompass any cast shadow or light it 
may cast on R . Nevertheless, neighborhood transfer captures 
effects impossible to obtain in real-time with previous methods. 

9. Results 

Full statistics are summarized in Table 1. We achieve real-time 
performance for all models except the transfer matrix ones (tea- 
pot, buddha, glider). For these models, multiplication with 25x25 
or 9x25 transfer matrices over the surface in software forms the 
bottleneck. Real-time results can be achieved even for these 
models after first fixing the light (allowing the view to be moved) 
or the view (allowing the light to be changed) and represent the 
second and third “render rate” entries in the table after slashes. 
The reason is that fixing either the view (which then fixes the 



model 


transfer 

type 


transfer 

shape 


transfer 

sampling 


preproc. 

time 


render 

rate 


head (fig 1) 


DS 


25 -M 


50k ver. mesh 


l.lh 


129 


bird (fig 4) 


DS 


25-M 


50k ver. mesh 


1.2h 


125 


ring (fig 9) 


DS 


25-M 


256x256 grid 


8m 


94 


buddhad (fig 11) 


DS 


25-M 


50k ver. mesh 


2.5h 


125 


buddha_g (fig 1 1) 


GS 


25x2 5- Ad 


50k ver. mesh 


2.5h 


3.6/16/125 


tyra_d (fig 1 1) 


DS 


25-M 


100k ver. mesh 


2.4h 


83 


tyra_g (fig 1 1) 


GS 


25x25- Ad 


100k ver. mesh 


2.4h 


2.2/9.4/83 


teapot (fig 5) 


GS 


25x25-Ad 


150k ver. mesh 


4.4h 


1.7/7.7/49 


cloud (fig 6) 


DV 


25-M 


32x32x32 vol. 


15m 


40 


glider (fig 7) 


N 


9x2 5 -Ad 


64x64x8 vol. 


3h ! 


4/120/4 


Table 1: Results. Transfer types are DS=diffuse surface self-transfer, 
GS=glossy surface self-transfer, DV=diffuse volume self-transfer, and 
N=neighborhood transfer. Timings are on a 2.2GHz Pentium 4 with ATI 
Radeon 8500 graphics card. Render rates are in Hz. 





reflection vector R) or the light L p reduces the computation in (12) 
to a simple dot product, which can then be done in hardware. 
Fixed light rendering is slower than fixed view because the fixed 
light mode requires evaluation of the SH basis at a changing view 
vector, followed by a dot product, while fixed view requires only 
the dot product and is identical in performance to diffuse transfer. 
Rendering quality can be judged from images in this paper (Fig- 
ures 1 and 3-12) all of which were computed with the PC graphics 
hardware. Self-shadowing and interreflection effects are convinc- 
ing and robust. No depth tolerances are required to prevent self- 
shadowing artifacts as they are with the standard shadow buffer 
technique [43]. Even when the lighting contains very high fre- 
quencies (Figure 9, top row of Figure 10, and Figure 12), pleasing 
images are produced without temporal artifacts but with some 
blurring of self-shadows; the result looks, and indeed is, identical 
to blurring the incident lighting. 

Figure 10 compares shadowing results across different SH orders. 
Small light sources (top row) require more bands; larger ones are 
approximated very well with fewer bands. Using up to the quartic 
band (fifth order with 25 coefficients) provides good results and is 
a good match to today’s graphics hardware. Note that quality is 
not dictated solely by how much energy is ignored in SH- 
projecting the lighting - diffuse and glossy objects are effective 
low-pass filters of incident radiance. With self-transfer effects 
though, the extent of this low-pass filtering depends on the ob- 
ject’s geometry, varies spatially, and typically requires more than 
third order (quadratics), unlike unshadowed diffuse transfer [34]. 
Because of its rotational invariance (equation (4)), we consider 
the SH basis especially useful for our low-frequency lighting 
application compared to alternatives like spherical wavelets [37]. 
When dynamically sampling incident radiance, this property 
eliminates aliasing which would otherwise produce temporal 
artifacts, like shading “wobble”, if projected to other bases with 
the same number of coefficients. Ringing or Gibbs phenomenon 
(oscillatory undershoot and overshoot in the reconstruction 
function) can be a problem when the lighting environment has 
significant energy near its highest represented band [10][42], We 
notice ringing artifacts only on very simple models such as the 
ones in Figures 9 and 10; artifacts are masked on complex 
meshes. Of course, reducing lighting frequency by attenuating 
higher frequency bands, called ‘^windowing”, also reduces ringing 
(see Figure 10, columns f and g). 

10. Conclusions and Future Work 

Much important shading variation over a complex object such as a 
human face is due to itself. Precomputed radiance self-transfer is 
a general method for capturing the occlusion and scattering effects 
an object has on itself with respect to any low-frequency lighting 
environment. When the actual incident lighting is substituted at 
run-time, the resulting model provides global illumination effects 
like soft shadows, interreflections, and caustics in real-time. 
Using graphics hardware, incident lighting can be sampled every 
frame and at multiple points near the object allowing dynamic, 
local lighting. Neighborhood-transfer generalizes the concept by 
recording transfer over 3D space, allowing cast soft shadows and 
caustics onto arbitrary receivers. 

In future work, we want to apply precomputed transfer to more 
sophisticated transport models, especially subsurface scattering 
[22]. We believe the smoothness of exiting radiance produced by 
this model makes it particularly suitable for SH-parameterized 
transfer. It would also be valuable to combine existing shadowing 
techniques with ours, by decomposing the scene’s lighting into 
high and low frequency terms. Compression of transfer fields is 
an important but unaddressed problem. Extension to deformable 
objects like human characters could be achieved by parameteriz- 



ing the precomputed self-transfer in the same way as the 
deformation, assuming the number of independent degrees of 
freedom remains manageable. Finally, we are interested in 
tracking fast- improving PC graphics hardware so that all compu- 
tation, including transfer matrix transforms and SH-rotation, may 
eventually be performed on the GPU. 
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Figure 9: Real-time, 

dynamic caustics and 
shadows using diffuse 
self-transfer. The ring 
model is rendered with a 
traditional environment 
map; the ground plane 
was rendered using pre- 
computed self-transfer 
results from a caustic 
simulation combining 
both the specular ring 
and a diffuse ground 
(Section 5). A 25 com- 
ponent transfer vector 
was recorded over a 
256x256 grid on the 
ground plane. These two 
images were generated 
by rotating an acquired 
environment around the 
model. A frame rate of 
130Hz is obtained in our 
implementation . 




(a) linear (b) quadratic (c) cubic (d) quartic* (e) quintic (f) high-order (g) windowed 
(rj=2) (n=3) (77=4) ( 77 = 5 ) (77=6 ) (tt=26) ( 77=26) (h) ray traced 



Figure 10: Comparison of SH orders for representing diffuse self-transfer. Shadows are cast onto a ground plane from a 
single polygon seen obliquely at bottom. Angular radius of a constant circular light used for illumination is shown at left. 
Higher-orders provide greater accuracy for small lights, but give rise to ringing (which is reduced by windowing in g). Note that 
n 2 coefficients are required for projection order n. We use the quartic projection (starred) for other result images in this paper. 





(a) diffuse, unshadowed (b) diffuse, interreflected (c) glossy, unshadowed (d) glossy, interreflected 

Figure 11: Diffuse and Glossy Self-transfer. Unshadowed transfer (a,c) includes no global transport effects. Interreflected transfer 
(b,d) includes both shadows and interreflections. 




(a) no interreflections (shadowed transfer) (b) 1 -bounce interreflections (c) 2-bounce interreflections 

Figure 12: Interreflections in Self-Transfer. Top row shows diffuse transfer; bottom row shows glossy transfer. Note the reflections 
under the knob on the lid and from the spout onto the body. Run-time performance is insensitive to interreflections; only the preproc- 
essed simulation must include additional bounces. Further bounces after the first or second typically provide only subtle change. 




